Module 6 - Programming Assignment

This is the notebook for the Module 6 Programming Assignment.

Here are a few tips for using the iPython HTML notebook:

You can use tab . Try le<<tab> and see the available functions.
You can change the type of cell by picking "Code" or "Markdown" from the menu at the left.
If you keep typing in a Markdown text area, you will eventually get scroll bars. To prevent this, hit return when you come to the end of the window. Only a double return creates a new paragraph.
You can find out more about Markdown text here: http://daringfireball.net/projects/markdown/ (Copy this link and put it in another tab for reference--Don't click it or you'll leave your notebook!).
Every so often, restart the kernel, clear all output and run all code cells so you can be certain that you didn't define something out of order.

You should rename this notebook to be <your JHED id>_PR6.ipynb Do it right now.

Make certain the entire notebook executes before you submit it. (See Hint #5 above)

Change the following variables:



In [1]:

    
name = "Shehzad Qureshi"
jhed_id = "squresh6"
if name == "Student Name" or jhed_id == "sname1":
    raise Exception( "Change the name and/or JHED ID...preferrably to yours.")

Add whatever additional imports you require here. Stick with the standard libraries and those required by the class. The import gives you access to these functions: http://ipython.org/ipython-doc/stable/api/generated/IPython.core.display.html (Copy this link) Which, among other things, will permit you to display HTML as the result of evaluated code (see HTML() or display_html()).



In [2]:

    
from IPython.core.display import *
from StringIO import StringIO

For this assignment you're going to implement the unification algorithm. You will need this for a future module.

Here is imperative pseudocode for the algorithm:

def unify( exp1, exp2):

# base cases
if exp1 and exp2 are constants or the empty list:
    if exp1 = exp2 then return {}
    else return FAIL
if exp1 is a variable:
    if exp1 occurs in exp2 then return FAIL
    else return {exp1/exp2}
if exp2 is a variable:
    if exp2 occurs in exp1 then return FAIL
    else return {exp2/exp1}

# induction
first1 = first element of exp1
first2 = first element of exp2
result1 = unify( first1, first2)
if result1 = FAIL then return FAIL
apply result1 to rest of exp1 and exp2
result2 = unify( rest of exp1, rest of exp2)
if result2 = FAIL then return FAIL
return composition of result1 and result2

</code>

unify can return fail (use None), {} (the empty substitution list) or a substitution list. This sometimes confuses people..."Sam" unifying with "Sam" is not a failure so you return {} because there were no variables so there were no substitutions.

Use a very simple approach...let variables be Strings that start with "?" and let constants be all other Strings. You can write functions that identify whether a String is a variable or constant. An expression can be a List. In this regard, it might be useful to know that:

type( "a") <type 'str'> type( "a") == str True type( "a") == list False type( ["a"]) == list True

None is Python's version of null or nil in other languages and can be used for fail.

You can represent an expression as a List of expressions, variables and constants:

likes(Bill, Sam) father( ?x, Sam)

can be written as:

["likes" "Bill" "Sam"] ["father" "?x" "Sam"].

these can, of course, be nested. These means that all that is expected of you in your examples is to write:

unify( ["likes", "Bill", "Sam"], ["likes", "?x", "Sam"])

and not:

unify( "likes( Bill, Sam)", "likes( ?x, Sam)")

unless...you may, for extra credit, implement a simple parser so that you can write the last line for your examples.

One easy way to do this is to use an S-expression syntax (like Lisp):

unify( parse( "(likes Bill ?x)"), parse( "(likes ?y Sam)"))

However, make sure you have the assignment working before embarking on the extra credit. Show 10 examples of the unify() algorithm being called. 5 with it succeeding and 5 with it failing. They should all be syntactically different (that is, I shouldn't be able to substitute names, for example, "hate" for "like" and have it be the same example).

I will test your code with my own examples.

Put all of your functions, objects and documentation below here.

Unification

Let's start by defining helper functions to determine if statements are a constant, variable, or expression. Constants are simple string literals, variables are string literals that begin with "?" and expressions are lists of constants and variables.



In [3]:

    
def is_constant(x):
    return type(x) == str and not x.startswith("?")



In [4]:

    
def is_variable(x):
    return type(x) == str and x.startswith("?")



In [5]:

    
def is_list(x):
    return type(x) == list

Now for some basic unit-testing to ensure that it works.



In [6]:

    
print is_constant("Sam") == True
print is_constant("?x") == False
print is_constant(["likes", "Bill", "Sam"]) == False
print is_variable("Sam") == False
print is_variable("?x") == True
print is_variable(["likes", "Bill", "Sam"]) == False
print is_list("Sam") == False
print is_list("?x") == False
print is_list(["likes","Bill","Sam"]) == True









    



True
True
True
True
True
True
True
True
True

We need to be able to parse the variable name from the string. This is as simple as removing the "?" from the beginning. We need to do the reverse as well, i.e. make a variable string from a variable.



In [7]:

    
def get_varname(var):
    return var[1:]



In [8]:

    
def make_var(var):
    return "?{0}".format(var)



In [9]:

    
print get_varname("?x") == "x"
print get_varname("?son") == "son"
print make_var("x") == "?x"
print make_var("son") == "?son"









    



True
True
True
True

We need to be able to get the value of a statement. We need to take into account that some values may be of the form "F(A)" as well as the normal "Fred", etc.



In [10]:

    
def get_value(val):
    if is_list(val):
        lval, rval = val[0], val[1:]
        if is_list(rval[0]): return "{0}({1})".format(lval, get_value(rval[0]))
        else: return "{0}({1})".format(lval, ",".join(rval))
    else:
        return val



In [11]:

    
print get_value("Fred") == "Fred"
print get_value(["son", "Barney"]) == "son(Barney)"
print get_value(["P", ["F", "C"]]) == "P(F(C))"









    



True
True
True

Now we need to format our output substitution.



In [12]:

    
def pair(x, y):
    var = get_varname(x)
    val = get_value(y)
    return "{0}/{1}".format(var, val)

And another function to test if a variable occurs in another expression.



In [13]:

    
def occurs(x, y):
    return True if x in y else False

We will need a way to extract a variable and value from a formatted substitution.



In [14]:

    
def parse_substitution(sub):
    var = sub.split("/")
    return (var[0], var[1])

Now we're getting into the heart of the algorithm.

This function will propagate a substitution into an expression. If the expression is a compound expression, it recurses.



In [15]:

    
def propagate(x, sub):
    if len(sub) == 0 or len(x) == 0: return x
    var, value = parse_substitution(sub[0])
    for i, xval in enumerate(x):
        if is_list(xval): x[i] = propagate(xval, sub)
        elif make_var(var) in xval:
            x[i] = value
    return x

This is a helper function to specifically unify a variable and another expression. A check was added to ensure that the variable was not already in the output.



In [16]:

    
def unify_variable(x, y, th):
    if occurs(x, y): return None
    else:
        if pair(x, y) not in th: th.append(pair(x, y))
        return th

This is going to be the main worker function. The order of processing is as follows:

Checks for failure
Checks for empty expressions
Checks for matching constants
Processes variables
Processes compound expressions by recursion as necessary



In [17]:

    
def unify_expression(x, y, th):
    if th is None: return th
    elif is_list(x) and is_list(y) and len(x) == 0 and len(y) == 0: return th if x == y else None
    elif is_constant(x) and is_constant(y): return th if x == y else None
    elif is_variable(x): return unify_variable(x, y, th)
    elif is_variable(y): return unify_variable(y, x, th)
    
    result1 = unify(x[0], y[0])
    if result1 is None: return None
    if len(result1) > 0: x[1:], y[1:] = propagate(x[1:], result1), propagate(y[1:], result1)
    result2 = unify(x[1:], y[1:])
    if result2 is None: return None
    return result1 + result2

This is our entry point into the unification algorithm. Had to make this separate from the function above because IPython was acting weird by saving the output of th for every instance. Even for different cells run later.



In [18]:

    
def unify(x, y):
    th = list()
    return unify_expression(x, y, th)

Let's start with some basic testing of constants and variables.



In [19]:

    
print unify("Sam", "Sam") == []
print unify("Sam", "Bill") == None
print unify("?x", "Bill") == ["x/Bill"]
print unify("Bill", "?x") == ["x/Bill"]
print unify("?x", "?x") == None









    



True
True
True
True
True

Coding for compound expressions requires a bit more work, so here's a simple case to start with.



In [20]:

    
print unify(["likes", "Bill", "Sam"], ["likes", "?x", "Sam"]) == ["x/Bill"]









    



True

Now that it works, here are the examples listed from Professor Butcher's Blackboard post.

Note that P(A, B) is translated to ["P", "A", "B"] for input.

Likewise, P(A, F(B)) is translated to ["P", "A", ["F", "B"]]



In [21]:

    
print unify(["P", "A", "B"], ["P", "A", "B"]) == []
print unify(["P", "A", "?x"], ["P", "A", "B"]) == ["x/B"]
print unify(["P", "A", "B"], ["P", "B", "C"]) == None
print unify(["P", "A", ["F", "B"]], ["P", "A", ["F", "B"]]) == []
print unify(["P", "A", ["F", "B"]], ["P", "A", ["K", "B"]]) == None
print unify(["P", "A", ["F", "B"]], ["P", "A", ["F", "C"]]) == None
print unify(["P", "A", "?x"], ["P", "A", ["F", "C"]]) == ["x/F(C)"]
print unify(["P", "A", "B"], ["P", "A", ["F", "C"]]) == None
print unify(["P", "A", "?x"], ["P", "?y", ["F", "C"]]) == ["y/A", "x/F(C)"]
print unify(["P", "A", "?x"], ["P", "?y", ["F", "?y"]]) == ["y/A", "x/F(A)"]
print unify(["P", "A", "?x"], ["P", "?y", ["F", "?x"]]) == None
print unify(["P", "A", "B"], ["?x", "A", "B"]) == ["x/P"]









    



True
True
True
True
True
True
True
True
True
True
True
True

Here are more examples from the self-check.



In [22]:

    
print unify("Fred", "Barney") == None
print unify("Pebbles", "Pebbles") == []
print unify(["quarry-worker", "Fred"], ["quarry-worker", "?x"]) == ["x/Fred"]
print unify(["son", "Barney", "?x"] , ["son", "?y", "Bam-Bam"]) == ["y/Barney", "x/Bam-Bam"]
print unify(["married", "?x", "?y"], ["married", "Barney", "Wilma"]) == ["x/Barney", "y/Wilma"]
print unify(["son", "Barney", "?x"], ["son", "?y", ["son", "Barney"]]) == ["y/Barney", "x/son(Barney)"]
print unify(["son", "Barney", "?x"], ["son", "?y", ["son", "?y"]]) == ["y/Barney", "x/son(Barney)"]
print unify(["son", "Barney", "Bam-Bam"], ["son", "?y", ["son", "Barney"]]) == None
print unify(["loves", "Fred", "Fred"], ["loves", "?x", "?x"]) == ["x/Fred"]
print unify(["future", "George", "Fred"], ["future", "?y", "?y"]) == None









    



True
True
True
True
True
True
True
True
True
True

Summary

Unification is the process of finding a substitution between two logical expressions in order to make them identical, a necessary part of inference in first order logic. Substitutions usually take the form of {variable/constant} where a variable can be substituted for any constant. Unification is part of both forward and backward chaining algorithms used for inference in first order logic.

One of the problems with pseudocode is that it is very non-specific. But that is the point of pseudocode isn't it... I spent a lot of time thinking what "apply result1 to rest of exp1 and exp2" meant. The textbook had a similar issue: it called UNIFY-VAR with three variables but was referencing four. Where did val come from? I couldn't figure it out. For the above pseudocode I couldn't figure out what we were applying to the remained of the expressions. In the end and after a lot of debug code I managed to figure it out... the point was to do some sort of forward substitution.

This algorithm wasn't as hard as I anticipated it to be once I figured out the quirks. I spent a few hours studying it and running various scenarios through my head to see what it entailed and after running many test cases figured out what it was supposed to do. I am curious as to how this ties into planning next week.

Let's try to build an S-expression parser. Supposedly this is used in Lisp. An example is shown below:

unify(parse("(likes Bill ?x)"), parse("(likes ?y Sam)"))

I will assume that the output will be a list, similar to the requirements above.



In [23]:

    
def parse_compound(expr):
    vals = expr[:-1].split("(", 1)
    for i, val in enumerate(vals):
        if "(" in val and ")" in val:
            vals[i] = parse_compound(vals[i])
    return vals



In [24]:

    
def parse(expr):
    svalues = expr[1:-1].split(" ")
    for i, val in enumerate(svalues):
        if "(" in val and ")" in val:
            svalues[i] = parse_compound(val)
    return svalues



In [25]:

    
print parse("(likes Bill ?x)")
print parse("(likes ?y Sam)")
print parse("(likes ?x son(Sam))")
print parse("(likes ?x son(brother(Sam)))")
print parse("(son ?y son(Barney))")









    



['likes', 'Bill', '?x']
['likes', '?y', 'Sam']
['likes', '?x', ['son', 'Sam']]
['likes', '?x', ['son', ['brother', 'Sam']]]
['son', '?y', ['son', 'Barney']]

Looks good so far. Let's see if we can run our tests again using the new parser.



In [26]:

    
print unify(parse("(P A B)"), parse("(P A B)")) == []
print unify(parse("(P A ?x)"), parse("(P A B)")) == ["x/B"]
print unify(parse("(P A B)"), parse("(P B C)")) == None
print unify(parse("(P A F(B))"), parse("(P A F(B))")) == []
print unify(parse("(P A F(B))"), parse("(P A K(B))")) == None
print unify(parse("(P A F(B))"), parse("(P A F(C))")) == None
print unify(parse("(P A ?x)"), parse("(P A F(C))")) == ["x/F(C)"]
print unify(parse("(P A B)"), parse("(P A F(C))")) == None
print unify(parse("(P A ?x)"), parse("(P ?y F(C))")) == ["y/A", "x/F(C)"]
print unify(parse("(P A ?x)"), parse("(P ?y F(?y))")) == ["y/A", "x/F(A)"]
print unify(parse("(P A ?x)"), parse("(P ?y F(?x))")) == None
print unify(parse("(P A B)"), parse("(?x A B)")) == ["x/P"]
print unify(parse("(P A ?x)"), parse("(P A F(S(C)))")) == ["x/F(S(C))"]









    



True
True
True
True
True
True
True
True
True
True
True
True
True



In [27]:

    
print unify(parse("(quarry-worker Fred)"), parse("(quarry-worker ?x)")) == ["x/Fred"]
print unify(parse("(son Barney ?x)"), parse("(son ?y Bam-Bam)")) == ["y/Barney", "x/Bam-Bam"]
print unify(parse("(married ?x ?y)"), parse("(married Barney Wilma)")) == ["x/Barney", "y/Wilma"]
print unify(parse("(son Barney ?x)"), parse("(son ?y son(Barney))")) == ["y/Barney", "x/son(Barney)"]
print unify(parse("(son Barney ?x)"), parse("(son ?y son(?y))")) == ["y/Barney", "x/son(Barney)"]
print unify(parse("(son Barney Bam-Bam)"), parse("(son ?y son(Barney))")) == None
print unify(parse("(loves Fred Fred)"), parse("(loves ?x ?x)")) == ["x/Fred"]
print unify(parse("(future George Fred)"), parse("future ?y ?y")) == None









    



True
True
True
True
True
True
True
True



In [27]: